On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits



Similar resources

Best Arm Identification in Multi-Armed Bandits

We consider the problem of finding the best arm in a stochastic multi-armed bandit game. The regret of a forecaster is here defined by the gap between the mean reward of the optimal arm and the mean reward of the ultimately chosen arm. We propose a highly exploring UCB policy and a new algorithm based on successive rejects. We show that these algorithms are essentially optimal since their regre...


Best arm identification in multi-armed bandits with delayed feedback

We propose a generalization of the best arm identification problem in stochastic multiarmed bandits (MAB) to the setting where every pull of an arm is associated with delayed feedback. The delay in feedback increases the effective sample complexity of standard algorithms, but can be offset if we have access to partial feedback received before a pull is completed. We propose a general framework ...


Practical Algorithms for Best-K Identification in Multi-Armed Bandits

In the Best-K identification problem (Best-K-Arm), we are given N stochastic bandit arms with unknown reward distributions. Our goal is to identify the K arms with the largest means with high confidence, by drawing samples from the arms adaptively. This problem is motivated by various practical applications and has attracted considerable attention in the past decade. In this paper, we propose n...


Best Arm Identification for Contaminated Bandits

This paper studies active learning in the context of robust statistics. Specifically, we propose the Contaminated Best Arm Identification variant of the multi-armed bandit problem, in which every arm pull has probability ε of generating a sample from an arbitrary contamination distribution instead of the true underlying distribution. The goal is to identify the best (or approximately best) true...


Algorithms for Differentially Private Multi-Armed Bandits

We present differentially private algorithms for the stochastic Multi-Armed Bandit (MAB) problem. This is a problem for applications such as adaptive clinical trials, experiment design, and user-targeted advertising where private information is connected to individual rewards. Our major contribution is to show that there exist (ε, δ) differentially private variants of Upper Confidence Bound alg...


Journal

Journal title: IEEE Transactions on Signal Processing

Year: 2017

ISSN: 1053-587X, 1941-0476

DOI: 10.1109/tsp.2017.2706192